Sparse Principal Component Analysis with Constraints
Abstract
Sparse principal component analysis is a variant of classical principal component analysis that finds linear combinations of a small number of features which maximize the variance across the data. In this paper we propose a methodology for adding two general types of feature grouping constraints to the original sparse PCA optimization procedure. We derive convex relaxations of the considered constraints, ensuring the convexity of the resulting optimization problem. Empirical evaluation on three real-world problems, one in process monitoring sensor networks and two in social networks, serves to illustrate the usefulness of the proposed methodology.

Introduction

Sparse Principal Component Analysis (PCA) is an extension of the well-established PCA dimensionality reduction tool which aims at a reasonable trade-off between two conflicting goals: explaining as much variance as possible using near-orthogonal vectors that are constructed from as few features as possible. There are several justifications for using Sparse PCA. First, regular principal components are, in general, combinations of all features; they are unlikely to be sparse and are therefore difficult to interpret. Sparse PCA greatly improves the relevance and interpretability of the components, and is more likely to reveal the underlying structure of the data. In many real-life applications the features have a concrete physical meaning (e.g. genes, sensors, people), and interpretability, i.e. feature grouping based on correlation, is an important factor worth sacrificing some of the explained variance for. Second, under certain conditions (Zhang and El Ghaoui 2011), sparse components can be computed faster. Third, Sparse PCA provides better statistical regularization. Sparse PCA has been the focus of considerable research in the past decade.
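As a toy illustration of the interpretability point, the following Python sketch (synthetic data; the variable names, sizes, and threshold value are assumptions made up for illustration) computes an ordinary leading principal component, which loads on every feature, and then applies the simple thresholding post-processing from the early literature to obtain a sparse loading vector:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 200 samples of 10 features; only the first three
# features share a strong common signal, the rest are low-variance noise.
signal = rng.standard_normal((200, 1))
X = 0.1 * rng.standard_normal((200, 10))
X[:, :3] += signal

cov = np.cov(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)   # eigh: eigenvalues in ascending order
pc1 = eigvecs[:, -1]                     # leading PC: loads on all 10 features

# Post-processing by thresholding: zero out small loadings, renormalize.
tau = 0.15                               # threshold (assumed value)
sparse_pc1 = np.where(np.abs(pc1) > tau, pc1, 0.0)
sparse_pc1 /= np.linalg.norm(sparse_pc1)
```

The thresholded vector is far easier to interpret, but this kind of post-processing can discard a substantial amount of explained variance, which is what motivates building sparsity into the optimization itself.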
The first attempts at improving the interpretation of baseline PCA components were based on post-processing methods such as thresholding (Cadima and Jolliffe 1995) and factor rotation (Jolliffe 1995). Greedy heuristic Sparse PCA formulations were investigated in (Moghaddam et al. 2006) and (d'Aspremont et al. 2008). More recent methods cast the problem into an optimization framework: maximizing the explained variance along a normalized vector, penalized for the number of non-zero elements of that vector, aims at delivering the aforementioned goals simultaneously. The majority of these algorithms are based on non-convex formulations, including SPCA (Zou et al. 2006), SCoTLASS (Jolliffe et al. 2003), the regularized SVD method (Shen and Huang 2008), and the Generalized Power method (Journée et al. 2010). Unlike these approaches, the l1-norm based semidefinite relaxation DSPCA algorithm (d'Aspremont et al. 2007) guarantees global convergence and has been shown to provide better results than other algorithms, i.e. it produces sparser vectors while explaining the same amount of variance.

Copyright © 2012, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Interpreting Sparse PCA as feature grouping, where each component represents a group and group members correspond to the non-zero component elements, we propose an extension of the DSPCA algorithm in which the user is allowed to add several types of constraints on the groups' structure. The idea is to limit the set of feasible solutions by imposing additional goals on the components' structure, which are to be reached simultaneously through optimization. An alternative way of handling these constraints is to post-process the solution by removing component elements until the constraints are met. However, this can lead to a significant reduction in explained variance.
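To make the penalized-variance idea concrete, here is a minimal single-component sketch using a soft-thresholded power iteration, a simple non-convex heuristic in the spirit of the Generalized Power method family; it is not the DSPCA semidefinite relaxation, and the function name, penalty `gamma`, and toy covariance are all assumptions for illustration:

```python
import numpy as np

def sparse_pc(sigma, gamma, iters=200):
    """One sparse loading vector for covariance `sigma` via power
    iteration interleaved with soft-thresholding (l1-style penalty)."""
    d = sigma.shape[0]
    w = np.full(d, 1.0 / np.sqrt(d))        # deterministic unit-norm start
    for _ in range(iters):
        v = sigma @ w                        # power-method step
        # Soft-threshold: shrink loadings toward zero, small ones become 0.
        v = np.sign(v) * np.maximum(np.abs(v) - gamma, 0.0)
        norm = np.linalg.norm(v)
        if norm == 0.0:                      # gamma too large: all zeroed out
            return np.zeros(d)
        w = v / norm
    return w

# Toy covariance: features 0-2 are strongly correlated, the rest weak.
u = np.zeros(8)
u[:3] = 1.0 / np.sqrt(3)
sigma = 5.0 * np.outer(u, u) + 0.1 * np.eye(8)
w = sparse_pc(sigma, gamma=0.3)              # non-zeros only on features 0-2
```

Larger `gamma` trades explained variance for fewer non-zero loadings; unlike the convex DSPCA formulation, such heuristics carry no global-optimality guarantee.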
Unlike this baseline approach, the proposed solution is optimal, as it directly maximizes the variance subject to the constraints. The first type of constraints we consider are distance constraints. Consider an on-street parking problem, where the features are on-street parking blocks and the examples are hourly occupancies. For price-management purposes, the goal may be to group correlated parking blocks such that the sum of geographic distances between the blocks in a group is less than a specified value; the non-zero elements of each sparse component must therefore satisfy this requirement. The second type of constraints we consider are reliability constraints, which aim to maximize the overall reliability of the resulting groups. Assuming that each feature (e.g. a sensor) has a certain reliability defined by its failure probability, and that an entire component becomes temporarily suspended if any of its features fails, the use of features with low reliability can be costly. These constraints are especially important in industrial systems, where we wish to group correlated sensors such that the groups are robust in terms of maintenance. Another example can be found in social networks.

Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence